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We present a new estimator for computing free energy differences and thermodynamic expectations 
as well as their uncertainties from samples obtained from multiple equilibrium states via either sim- 
ulation or experiment. The estimator, which we term the multistate Bennett acceptance ratio (MBAR) 
estimator because it reduces to the Bennett acceptance ratio when only two states are considered, has 
significant advantages over multiple histogram reweighting methods for combining data from mul- 
tiple states. It does not require the sampled energy range to be discretized to produce histograms, 
eliminating bias due to energy binning and significantly reducing the time complexity of computing 
a solution to the estimating equations in many cases. Additionally, an estimate of the statistical un- 
certainty is provided for all estimated quantities. In the large sample limit, MBAR is unbiased and 
has the lowest variance of any known estimator for making use of equilibrium data collected from 
multiple states. We illustrate this method by producing a highly precise estimate of the potential of 
mean force for a DNA hairpin system, combining data from multiple optical tweezer measurements 
under constant force bias. 



I. INTRODUCTION 

A recurring challenge in statistical physics, compu- 
tational chemistry, and single molecule experiments is 
the collection of a sufficient amount of data to estimate 
physical quantities of interest to adequate precision. 
In computer simulations of physical or chemical mod- 
els, such quantities include potentials of mean force, 
phase coexistence curves, fluctuation or temperature- 
dependent properties, and free energy differences. In 
single-molecule experiments, these quantities might in- 
clude potentials of mean force along a pulling coordi- 
nate or the distance between fluorescence probes dur- 
ing resonant energy transfer. For all of these problems, 
collection of sufficient statistics for a reliable estimate 
often requires multiple simulations at different thermo- 
djmamic states [1] or measurements performed under 
different applied biasing potentials. In computer sim- 
ulations, multi-state techniques such as umbrella sam- 
pling [2], simulated [3] and parallel tempering [4] and 
the use of alchemical intermediates in free energy cal- 
culations can greatly aid convergence; in experiments, 
data collected under constant applied force can help 
provide adequate sampling of conformations of inter- 
est [24]. 

Even with these methods, it may require a large quan- 
tity of data to produce estimates with the desired pre- 
cision. Computing the most precise estimate possible 
from the available data can therefore be critical in al- 
lowing these quantities to be estimated with reasonable 
computational or experimental effort. While the choice 
of thermodynamic states to sample can also greatly af- 
fect efficiency, we focus here on only the problem of sta- 
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tistically efficient estimation given samples from prede- 
termined states. 

Early methods for computing free energy differ- 
ences [5, 6] or equilibrium expectations [2] relied upon 
one-sided exponential averaging (EXP), which is for- 
mally exact but does not make the most efficient use 
of data when samples from more than one state are 
available [7]. Subsequently, the Bennett acceptance ra- 
tio method (BAR) [8, 9] greatly improved upon EXP for 
the computation of free energy differences, producing 
statistically optimal estimates of free energy differences 
when two states are sampled [9] and yielding estimates 
that can be more than an order of magnitude more pre- 
cise [7] . More recently, multiple histogram reweighting 
methods [10, 11] were proposed as a way to incorporate 
data from multiple states to produce superior estimates 
of free energy differences and equilibrium expectations 
for arbitrary thermodynamic states, including states not 
sampled. 

While multiple histogram techniques — most notably, 
the weighted histogram analysis method (WHAM) [11] 
— can produce statistically optimal estimates of the dis- 
cretized densities of states [10] or histogram occupation 
probabilities [12], they have several limitations for the 
treatment of continuous systems. First, the reliance on 
energy histograms of width sufficient to contain many 
samples — often larger than many times the thermal en- 
ergy — introduces a bias that can be substantial and of- 
ten difficult to assess [13]. Second, unlike BAR, there 
are no direct expressions to estimate the statistical un- 
certainty in free energy differences or expectations ob- 
tained from WHAM. Third, application of WHAM to 
a samples collected with a biasing potential that is not 
trivially scaled by a linear field parameter requires a 
number of bins that grows exponentially in the number 
of states, making it computationally intractable for even 
modest numbers of states. While more recent maximum 
likelihood [12] and Bayesian formulations [14] mitigate 
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the memory requirements, they do not remove the his- 
togram bias effects, and introduce a costly Markov chain 
Monte Carlo sampling procedure to estimate uncertain- 
ties [14, 15]. 

Here, we use recent results from the field of statistical 
inference [16, 17, 18, 19] to construct a statistically opti- 
mal estimator for computing free energy differences and 
equilibrium expectations at arbitrary thermodynamic 
states, using equilibrium samples from multiple ther- 
modynamic states. The resulting estimator, termed the 
multistate Bennett acceptance ratio (MBAR) estimator as 
it reduces to Bennett's method when only two states are 
considered [8], is equivalent to WHAM in the limit that 
histogram bin widths are shrunk to zero but is derived 
without the need to invoke histograms. Unlike WHAM, 
this estimator provides a direct assessment of uncertain- 
ties, critical in making comparison between experiment 
and theory, and the computational expense of comput- 
ing the estimator remains modest across a wider vari- 
ety of applications. Furthermore, it can easily be ap- 
plied to data sampled from non-Boltzmann sampling 
schemes, or to the analysis of single-molecule experi- 
ments in cases where an external bias potential is ap- 
plied. 

This paper is organized as follows: Section II recapit- 
ulates the literature on extended bridge sampling esti- 
mators, used here as the basis for the MBAR estimator. 
Expressions for computing estimates of free energy dif- 
ferences (Section III) and equilibrium expectations (Sec- 
tion IV) are then provided. Finally, we illustrate the 
method in Section V by applying it to the estimation of 
the potential of mean force for a DNA hairpin system by 
combining data from multiple equilibrium optical force 
clamp experiments under different external biasing po- 
tentials. 



II. EXTENDED BRIDGE SAMPLING ESTIMATORS 

Suppose we obtain Ni uncorrelated equilibrium sam- 
ples from each of K thermodynamic states within the 
same ensemble, such as NVT, NPT, or /iVT (see Ap- 
pendix A for more information on subsampling corre- 
lated timeseries data to produce uncorrelated samples). 
Each state is characterized by a specified combination 
of inverse temperature, potential energy function, pres- 
sure, and/or chemical potential(s), depending upon the 
ensemble. We define the reduced potential function Ui{x) 
for state i to be 



,{x) = l3,[Ui{x)+piV{x) + ^lJn{x)] 



(1) 



where a; e F denotes the configuration of the system 
within a configuration space F, with volume V{x) (in 
the case of a constant pressure ensemble) and n{x) the 
number of molecules of each of M components of the 
system (in the case of a (semi)grand ensemble). For each 
state i, Pi denotes the inverse temperature, Ui{x) the 
potential energy function (which may include biasing 



weights). Pi the external pressure, and /Xj the vector of 
chemical potentials of the M system components. 

Configurations {xin}nLi from state i are sampled 
from the probability distribution 



Pi{x) ^ c^^qi{x) ; Ci = J dx qi{x) 



(2) 



where qi{x) is here nonnegative and represents an un- 
normalized density function, and is the (generally 
unknown) normalization constant (known in statistical 
mechanics as the partition function). In samples obtained 
from standard Metropolis Monte Carlo or molecular dy- 
namics simulations or from experiment, this unnormal- 
ized density is simply the Boltzmann weight qi{x) — 
exp[—Ui{x)] but may in general differ in simulations em- 
ploying non-Boltzmann weights, such as multicanonical 
simulations [20] and those using Tsallis statistics [21]. 

We wish to produce an estimator for the difference in 
dimensionless free energies 



fj fi 



Ci 
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(3) 



where the fi are related to the dimensional free energies 
Fi by fi = (3iFi, and also the equilibrium expectations 



dxpi{x) A{x) 



Jp dx A{x) qi{x) 
Jj.dxqi{x) 



(4) 



These expectations can be computed as ratios of the nor- 
malization constants if we define new functions q{x) — 
A{x)qi{x), where the q{x) no longer need be nonnega- 
tive for states from which no samples are collected [22]. 

To construct an estimator for these ratios of normal- 
ization constants, we first note the identity 



^ijQj) 



dx qi{x) 



/p dxqi{x) aij{x) qj{x) 



J^dx q,{x) 



= / dxq,{x)aij{x)qj{x) 



dx qj{x) 

r 



j^dxqj{x)aij{x)qi{x) 



/p dx qj{x) 



(5) 



which holds for arbitrary choice of functions aij {x), pro- 
vided the Ci are nonzero. 

Using this relation, summing over the index j, and 

substituting the empirical estimator J2n=i di^in) 
for the expectations (g)-, we obtain a set of K estimat- 
ing equations 

K ^ Ni K ^ Nj 



j=l n=l 



71 — 



n=l 



for i = 1,2, . . . , K, where solution of the set of equations 
for the Ci yields an estimate of the Cj from the sampled 
data determined up to a scalar multiplier. 
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Eq. 6 defines a family of asymptotically unbiased esti- 
mators parameterized by the choice of functions aij{x), 
known in the statistics literature as extended bridge sam- 
pling estimators [19]. By making the choice 



K 

a,j{x) = Nj cj^ / ^ Nk c'^^ Qkix) , 
fc=i 



(7) 



we obtain an estimator than has been proven to be op- 
timal, in the sense that it has the lowest variance for a 
large class of choices of aij{x) [19]. This estimator is 
also asymptotically unbiased and guaranteed to have a 
unique solution (up to a multiplicative constant) [19], 
and can also been derived from maximum-likelihood 
measure theoretic methods [18] or cast as a reverse lo- 
gistic regression problem [18, 23]. 

While a closed-form expression for the c; cannot be 
obtained from Eqs. 6 and 7, the Ci can nevertheless be 
easily computed by any suitable method for solving sys- 
tems of coupled nonlinear equations. A simple self- 
consistent iteration method and an efficient Newton- 
Raphson solver are described in Appendix C. 

In the large sample limit, the error in the ratios Ci /cj 
will be normally distributed [19], and the asymptotic co- 
variance matrix &ij = cov {9i, 9j), where 9i = Inci, can 
be estimated by [18] 
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where Iat is the N x N identity matrix (with N = 
^i^i Ni the total number of samples), and N = 
diag(iVi, A'2, . . . , Nk)- The superscript + denotes a suit- 
able generalized inverse, such as the standard Moore- 
Penrose pseudoinverse, since the quantity in parenthe- 
ses will be rank-deficient. W denotes the N x K matrix 
of weights 



^kC^^ qk{Xn) 



K 

E 

k=l 



(9) 



The samples are now indexed by a single index n = 
1, . . . ,N, as the association of which samples a;„ came 
from which distribution Pi{x) is no longer relevant. We 
note that this definition ensures J2n=i = 1 for all 
i ^ l,...,K and Y^f^i Nk Wni = 1 for all n = 1, . . . , iV. 
The computational cost of evaluating the pseudoinverse 
of an X matrix in computing can be reduced 
to that of computing the eigenvalue decomposition of 
a K X K matrix, and in many cases can be reduced di- 
rectly to a simpler K x K matrix (see Appendix D). 

The covariance of estimates of arbitrary functions 
4>{9i, . . . , Ok) and "^^{Oi, . . . , 9k) of the log normalization 
constants 9i can be estimated from by the expansion 



cov 



K 



d(t> - dip 



e 



'89, 



(10) 



III. FREE ENERGIES 

When configurations are sampled with Boltzmann 
statistics, where qi{x) = cxp[— Ui(a;)], Eqs. 6 and 7 pro- 
duce the following estimating equations for the dimen- 
sionless free energies 

/. = -Infx: K '"P^""'^"^"^^ (11) 

j = l „=1 JSJ^ gxp[/fe - Uk{Xjn)] 

k=l 

which must be solved self -consistency for the fi . Again, 
because the normalization constants are only deter- 
mined up to a multiplicative constant, the estimated free 
energies fi are determined uniquely only up to an addi- 
tive constant, so only differences Afij = fj — fi will be 
meaningful. 

The uncertainty in the estimated free energy differ- 
ence can be computed from Eqs. 8 and 10 as 



6'Af,, 



cov (-Incj/ci, -In Cj/ci) = - 26ij + 6^ 



Free energy differences and uncertainties between states 
not sampled are easily estimated by augmenting the 
set of states with additional reduced potentials Ui{x) 
with the number of samples Ni = 0. For these unsam- 
pled states, no additional self -consistent estimation is re- 
quired, so many such states can be estimated very effi- 
ciently. 



IV. EQUILIBRIUM EXPECTATIONS 

The equilibrium expectation of some mechanical ob- 
servable A{x) that depends only on configuration x 
(and not momentum) is given by Eq. 4, and can be com- 
puted as a ratio of normalization constants CA/ca by 
defining two additional "states" characterized by the 
fimctions [22] 

qaIx) = A{x) q{x) ; Qaix) = q{x) 

where again q{x) = cx-p[—u{x)] if the expectation with 
respect to the Boltzmann weight is desired. Even though 
qA{x) may no longer be strictly nonnegative, we can 
still make use of the extended bridge sampling estima- 
tor (Eq. 6) to estimate the expectation (A) since Na = 

Na = 0. 

Similarly we augment the matrix W (Eq. 9) with 
columns WnA and Wna corresponding to qAix) and 
qaix), respectively: 



.-1 A{x„) exp[-u{xn)] 
J2 Nk exp[/fc - Uk{xn)] 

k=l 

.-1 exp[-u{xn)] 

"a ^ 

J2 Nk exp[fk - Uk{xn)] 

k=l 



(12) 
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where the normalization constants ca and Ca are defined 
in terms of self -consistent estimating equations as 



JV 



A{x.a) cxp[-M(a;„)] 



^T. K 

E exp[/fc - Ufc(a;„)] 



k = l 



exp[-u(a;„)] 



JV 

E 

E ^fc exp[/fe - Ufc(a;„)] 



(13) 



We can then write the estimator of the expectation as 



A = 



CA 



N 



and an estimator for the uncertainty as 

5"^ A EE COV (C^/Ca, Ca/Co) = A^ {Q AA + Oaa - 2 Q Aa) 

where the covariance matrix is now computed from 
the augmented W. Covariances between estimates of 
{A) at different thermod5mamic states, or between two 
observables {A) and {B), can also be constructed by 
adding the appropriate columns to the covariance ma- 
trix and applying Eq. 10 to estimate the desired uncer- 
tainty. If the dimensionless free energies fi have already 
been determined, computation of A for any A{x) and 
any q{x) does not require additional iterative solution 
of the self -consistent estimating equations. 

V. APPLICATION TO LABORATORY EXPERIMENTS 

The MBAR estimator is not limited in application to 
data produced from simulation — it can also be applied 
to combine data from multiple equilibrium experiments 
in the presence of externally-applied fields. To illus- 
trate, we estimate potential of mean force (PMF) of a 
DNA hairpin attached by dsDNA linkers to glass beads 
along the distance between the beads. The collection 
of equilibrium trajectories under a variety of constant 
force loads (corresponding to a linear external poten- 
tial along the extension coordinate) for the DNA hair- 
pin system 20R55 / 4T collected by an optical double trap 
experiment [24] we reported earlier [25]. The complete 
dataset was obtained from Michael Woodside (National 
Institute for Nanotechnology, NRC and Department of 
Physics, University of Alberta), and consists of 16 tra- 
jectories at 296.15 K, each 5 s in duration and sampled 
with a period of 0.1 ms, totaling 50 000 samples each. 
Each trajectory was collected under a different constant 
force load, with force loads ranging from 12.35 pN to 
14.41 pN, with an estimated 10% relative error in the 
measurement of this force value. 

The data were analyzed to produce an optimal es- 
timate of the PMF under a force load of 14.19 pN, a 



force load where it is difficult to determine the entire 
PMF to high precision from equilibrium trajectories at 
this force load alone (Figure 1). The sampled extension 
range was divided up into 50 unequally-sized bins such 
that the number of samples per bin was equal in order to 
avoid regions with zero histogram counts, as would oc- 
cur with equally-spaced bins. Analysis with the MBAR 
estimator took 18 s on a standard 2.16 GHz Intel Core 2 
Duo MacBook Pro, and the resulting error bars are more 
than an order of magnitude smaller than those derived 
from the single trajectory at this force load in the poorly 
sampled region of the PMF. Below, we describe how this 
analysis was performed with both types of analysis. 

To estimate the potential of mean force from the 14.19 
pN trajectory alone (black filled squares in Figure 1), the 
total number of counts Ni per histogram bin was deter- 
mined, and the reduced potential of mean force (in units 
of kT) fi computed up to an irrelevant additive constant 
from 



/, = -\n{N,/wi) 



(14) 



where wi is the relative width of bin i, necessary to cor- 
rect for the nonuniform bin sizes. The statistical uncer- 
tainty in the histogram count was estimated by standard 
methods (see Eq. 26 of [26]) 



5''N., = gN,{l-N,/N) 



(15) 



where g is the statistical inefficiency of the extension 
time series, estimated from the extension autocorrela- 
tion function (see Section 5.2 of [26]). 

To estimate the potential of mean force using the 
MBAR estimator, the dataset was first subsampled with 
an interval equal to the statistical inefficiency of each tra- 
jectory at constant force to produce a set of uncorrelated 
samples. The reduced potential energy for each state k 
linder the experimental conditions corresponds to 



u^(x) = /3[C/o(a;)+py(a;) + C/r*(a3)] 



(16) 



where Uq(x) is the (unknown) potential energy function 
of the system in the absence of an externally-applied bi- 
asing potential, 'pVix) is the pressure-volume work, and 
U'^'^ix) is the (known) externally applied biasing poten- 
tial, given by 



(x) 



-Fkz{x) + flfe 



(17) 



where z{x) is the extension coordinate, Fk is the con- 
stant applied force along the positive z direction, and Ofe 
is a constant offset. 

Because only differences Ui{x) — Uj{x) appear in the 
estimating equations (Eq. 11), the unknown components 
of the reduced potential energy cancel out and need not 
be considered 



Ui{x) — Uj(x) 



-I3{F, ~ F,)z{x) + I3{a^ - aj) (18) 
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FIG. 1: Potential of mean force of DNA hairpin and dsDNA 
handles system 20R55/4T under 14.19 pN external force. 
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While the constant term /3{ai — aj) involving the un- 
known zero potential intercepts will appear in the es- 
timated state free energies, these do not influence com- 
puted expectations in Eq. 14, and are hence irrelevant. 

The probability of finding the system in a bin i under 
the conditions of interest is given by the expectation 

P^ = {X^i^)) (19) 

where is an indicator function that assumes the 

value of 1 if the system is in bin i and zero otherwise. 
The potential of mean force (in units of thermal energy 
kT) can then be computed from the pi, up to an irrele- 
vant additive constant, as 

/, = -ln{p,/w,) (20) 

and the uncertainties propagated by Eq. 10. 

Because each potential of mean force is only deter- 
mined up to an arbitrary additive constant, the mean 
value of each PMF was subtracted before plotting. This 
is equivalent to choosing the additive constants so as to 
obtain an optimal least-squares RMS fit between the two 
PMFs. 

It should be noted that these PMF corresponds to the 
potential of mean force along the entire system con- 
nected to the glass beads, which includes not only the 
DNA hairpin but the two dsDNA linkers and their at- 
tachments to the glass beads. In other work, deconvo- 
lution or related methods have been applied to correct 
for the stretching of the linkers to estimate PMF for the 
DNA hairpin alone [27]. 

VI. DISCUSSION 

The MBAR estimator presented here provides a rapid 
and robust way to extract estimates of free energy differ- 
ences and equilibrium expectations from multiple equi- 
librium samples of different thermodynamic states in a 



statistically optimal way. As the estimator is asymptoti- 
cally efficient among a wide class of "bridge sampling" 
estimators [19], which includes EXP and BAR as mem- 
bers, the resulting estimates from MBAR will have the 
lowest (or equal) variance in the large sample limit. 

While multiple histogram techniques [10, 11, 12, 14] 
have been widely used for combining data from multi- 
state simulations, the MBAR estimator supplants these 
methods in the majority of cases. Most importantly, it 
provides a reliable and inexpensive method for estimat- 
ing the uncertainties in the resulting estimates and their 
correlations, which are critical for propagating uncer- 
tainties to quantities of interest. Additionally, the elim- 
ination of histograms avoids both the bias arising from 
discretization of continuous energies as well as the com- 
putational overhead of constructing and storing high di- 
mensional histograms. 

In this framework, multiple histogram reweighting 
methods such as WHAM can be understood as a 
histogram kernel density estimator approximation to 
MBAR. In some applications, histograms can reduce the 
computational expense required for solving the estimat- 
ing equations (Eq. 6) at the expense of introducing bias. 
When samples are distributed according to the Boltz- 
mann weight, the estimator for the free energies (Eq. 11) 
is precisely Eq. 21 of [11] or Eq. 15 of [28], in both cases 
presented as a reduction of the histogram bin width to 
zero in the standard WHAM equations (Eqs. 19-20 of 
[11]). While the validity of this limit is dubious — the 
derivations in these references rely upon an estimate of 
the uncertainty in each histogram count which cannot 
be correct when the bins are nearly empty — the deriva- 
tion of this equation from the extended bridge sampling 
estimator demonstrates for the first time that these equa- 
tions are, in fact, asymptotically unbiased estimators of 
the true free energy differences. 

The MBAR estimator also can be considered a mul- 
tistate generalization of Bennett's acceptance ratio esti- 
mator (BAR) [8]. In deriving BAR, Bennett constructed 
an estimator from Eq. 5 directly, determining the single 
a{x) which minimized the variance of the estimator of 
the free energy difference between only two states. In 
deriving MBAR, summing over all states j and deter- 
mining the functions aij{x) that minimizes the covari- 
ance matrix of the estimator for ratios of normalization 
constants produces an optimal estimator for the multi- 
state case. A proof of the equivalence of MBAR and BAR 
for two states can be found in Appendix E. 

BAR and a recent pairwise multistate generalization 
(which we shall refer to as PBAR) [29] differ from MBAR 
in that they can also be applied to nonequilibrium work 
measurements between pairs of states, in addition to 
equilibrium reduced potential differences {instantaneous 
work measurements). However, PBAR constructs a to- 
tal likelihood function from products of likelihood func- 
tions connecting pairs of states, assuming independence 
of all work measurements. For equilibrium samples, 
this means that a sampled configuration a;„ from a state 
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i can only be used to provide information about the in- 
stantaneous work required to switch to a single other 
state i for use in the PBAR estimator, whereas in MBAR, 
each sampled cc„ can be used to provide information 
about all states. As a result, MBAR should require sig- 
nificant fewer samples from each state to produce an es- 
timate of equivalent precision with equilibrium data. 

A Python implementation of the MBAR estimator de- 
scribed here is available under the GNU General Public 
License (GPL), and is provided online, along with sev- 
eral example applications, at https://simtk.org/ 
home/ pymbar. 
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APPENDIX A: CORRELATED TIME SERIES DATA 



While estimating equations based on Eq. 6 can be applied to correlated or imcorrelated datasets, provided that the 
empirical estimator 



remains asymptotically unbiased, the asymptotic covariance matrix estimator (Eq. 8) only produces sensible esti- 
mates when applied to uncorrelated datasets. Application to correlated datasets may produce severe underestimates 
of the true statistical uncertainty, and should be avoided. 

A set of uncorrelated configurations can be obtained from a correlated time series, such as is generated by a molec- 
ular dynamics or Metropolis Monte Carlo simulation, by subsamplrng the time series with an interval approximately 
equal to the equilibrium relaxation time for the system. Because the equilibrium relaxation time is difficult to com- 
pute for all but the simplest systems, we find the maximum of the statistical inefficiency g computed for several 
relevant observables (such as the reduced potential Uk{x) in Boltzmann- weighted sampling, structural observables 
A{x) in the computation of potentials of mean force, etc.) provides a practical estimate useful for subsampling. 

The statistical inefficiency qa of the observable A{x) of a time series {xt}J^i is formally defined as (see Janke [30] 
for a detailed exposition) 

gA = 1 + 2ta 

T 

TA = E (1 - t/T) CAA{t) 
t=l 

CaaH) . - (A)^. 

where ta denotes the integrated autocorrelation time and CAA{t) the normalized fluctuation autocorrelation function 
of the observable A. 

Direct application of these equations substituting the empirical estimator for the expectation can be problematic 
due to statistical noise. As a result, there exist a number of standard procedures [26, 30, 31, 32] to improve the quality 
and stability of this estimate for physical systems, making use of properties such as stationarity. The fast method for 
estimating the integrated autocorrelation time described in Section 5.2 of Chodera et al. [26] is implemented in the 
Python implementation of MBAR available online. 
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APPENDIX B: RECOMPUTATION OF REDUCED ENERGIES AT MULTIPLE STATES 



Application of MBAR to simulation data requires Uk{xn) to be evaluated for all K reduced potential functions 
Uk{x) and all N uncorrelated sampled configuration cc„, a total of KN reduced potential evaluations. In practice, 
this is not overly burdensome; the samples a;„ are generally produced by schemes that generate chains of highly 
correlated samples, such as molecular dynamics or Monte Carlo simulations. Once the stored configurations are sub- 
sampled to eliminate correlations and produce an effectively uncorrelated sample (as described above), the number 
of remaining samples N « T / g is generally smaller than the number of samples T produced during the simulation 
by one or more orders of magnitude. 

In cases where the Uk{x) differ only by a linear scaling parameter of one or more components (such as temperature 
or an external field parameter), computation of Uk{xn) for all K states is a trivial operation. For other cases, such as 
when all samples are collected from thermodynamic states that only differ in the external biasing potential (e.g. linear 
or harmonic), we note that the reduced energy differences Uk{x) — Ui{x) involve only differences in the external 
biasing potential, which can often be rapidly computed. Section V contains an illustration of this in application to 
single-molecule pulling experiments. 

APPENDIX C: EFFICIENT SOLUTION OF THE ESTIMATING EQUATIONS 

A number of methods can be used to obtain a self-consistent solution to the free energy estimating equations 
obtained from combining Eqs. 6 and 7 

K Nj 

^^ = T.T. K ' — • (CD 

k=l 

or, in terms of the dimensionless free energies fi = — In 

K 

= -i^EE K ' — • (C2) 

k=l 

While any method capable of solving a coupled set of nonlinear equations may be employed here, we describe some 
practical choices we made in the implementation of this algorithm. While any method capable of solving a coupled 
set of nonlinear equations may be employed, we describe two approaches to their solution: a straightforward yet 
reliable self -consistent iteration method and an efficient yet slightly less reliable Newton-Raphson method. Both 
methods are implemented in the Python implementation of the estimator available online. 



1. Self-consistent iteration 

As in [11], the fi could be obtained by self-consistent iteration of Eq. 11 using the last set of iterates {fi"^}iLi to 
produce a new estimated set of iterates 

A,-l 

Convergence is assured regardless of the initial choice of ff^\ so it is sufficient to initialize the iteration by choosing 

ff^ = 0. Alternative initial choices of the initial reduced free energies ff''' may speed convergence. For example, we 
have found the choice 

/r = -j^Y^^^qM (C4) 

n— 1 

which, for Boltzmann weighting {qk{x) = exp[—Uk{x)]) corresponds to the average reduced potential energy, usu- 
ally works well. Additional inexpensive choices are possible, such as fixing /}"^ — and estimating consecutive 
differences ifj^^_li — /f,*^^)/ k = 1,2, . . . , K — 1 using the Bennett acceptance ratio (BAR) estimator [8, 9]. 
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a. Cautions 

For numerical reasons, it is convenient to constrain /i = during the course of iteration by subtracting /i from 
the updated values in order to obtain a unique solution and prevent uncontrolled growth in the magnitude of the 
estimates. Iteration is terminated when the quantities of interest change by a fraction of the desired precision with 

additional iterations, but a convenient rule of thumb is to terminate when maxi=2,...,A' l/f"^^'' ~ // l/l/f"'' I < 10^^. 
Because the quantities of interest and the relative free energies can converge at different rates, it is advised that the 
former be monitored when possible. 

It is also critical to avoid overflow in the computation of exponentials e". To compute log sums of the form 
In ^6 can use the equivalent form 

N N 

In exp[a,i] = c + In exp[a„ — c] (C5) 

Ti— 1 n— 1 

where c = niax„ an- To minimize underflow, the terms cxp[a„ — c] can be summed in order from smallest to largest. 

2. Newton-Raphson 

A more efficient approach to determination of the fi is to employ a Newton-Raphson solver, which has the ad- 
vantage of quadratic convergence (a near doubling of the number of digits of precision) with each iteration when 
sufficiently near the solution. Because each iteration requires inversion of a {K — 1) x [K — 1) matrix, this approach 
is only efficient if K is small, say K < 100, but this will be satisfied in a wide number of cases. 

First, we write the estimating equations in terms of a set of functions gi{9) such that the solution of the estimating 
equations (Eq. C2) corresponds to g(^) — 0. Several such choices of both the function g(0) and the parameterization 
(the normalization constants Ci or their logarithms 9i) are possible, and the efficiencies of approaches based on 
different choices may differ substantially, but we find it convenient to choose 

N 

g,{9) = N,~N,Y,Wm{0)- (C6) 

n=l 

where Wni is defined in Eq. 9. It can easily be seen that g{6) = is equivalent to the estimating equations: 

N 

^ 7V,-7V^^W'„,W = 

n=l 

^kCf.^ qk{Xn) 

k=l 

^ c, = ^'^"^"^ (C7) 

k=l 

In Newton-Raphson, the function g{0) is expanded about the current iterate S*-"' to first order: 

g(0("+i)) w g(6>(")) + H(6»("')(6>("+i) -6>(")). (C8) 

where 



N 

- E N^Wn^{l-N,Wn^) if i = j 

^"=1 (C9) 

n=l 
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We seek the next iterate 0^"+^') such that g{6^"^^^) = 0, which yields the update equation 

Qin+i) ^ -y[H(6»("))]+g(6/(")) (CIO) 

where + denotes the pseudoinverse. If all the qi{x) are unique and Ni > for all states, the standard matrix inverse 
may be substituted for the pseudoinverse. 

a. Cautions 

We only need to iterate over states for which iV^ > 0; the relative free energies of states where Ni = 0, and 
expectation values at all states, can be determined after the self-consistent equations are solved to determine the 
relative free energies of states where A'^^ > 0. Since we must constrain /i = to avoid drift during the process of free 
energy determination, we can simply use a modified form of Eq. CIO where rows and columns corresponding to the 
first state are omitted. 

^(2S:if) = 7[H(0("))(2:K,2:K)]+g(e<"^)(2:K,2:K). (Cll) 

7 G (0, 1] is a scalar multiplier that controls the rate of convergence. Since the initial iterate 0*^^^ may be far from 
the realm of quadratic convergence (i.e. outside the range at which the Taylor expansion in Eq. C8 holds), it is often 
safer to choose an initial 7^1. We have found 7 = 0.1 works well for the first step, with 7=1 used thereafter. 

Even then, there are times when with reduced 7 does not prevent numerical instability. The instability may be 
due to the initial guess iterate 0'"-' being too far from the region of quadratic convergence, such that the first-order 
Taylor expansion above is a poor approximation to g(0) in Eq. C8. In this case, a better procedure for choosing 
the initial iterate may aid convergence. Starting with one or more iterations of the self-consistent method (Section 
CI), or using an initial estimate from application of BAR [9] to sequential states may be sufficient. Less commonly, 
failure to converge may be result from numerical precision limiting the accurate calculation of the pseudoinverse 
[li{9^"'^)^2:K,2:K}]^- 1^1 all cascs, we find that self -consistent iteration still works reliably to recover the estimator, and 
can be used as a fallback procedure. 



APPENDIX D: EFFICIENT COMPUTATION OF THE ASYMPTOTIC COVARIANCE MATRIX 

1. Singular value decomposition 

The N X K matrix W (Eq. 9 in the main paper) can be written in terms of its singular value decomposition 

W = USV'^ (Dl) 

where U is an iV x unitary matrix of left singular vectors (such that UU^ = Ijv), T,is an N x K matrix containing 
L < K singular values along the diagonal, and Y \sa K x K imitary matrix of right singular vectors. 
The estimator for the asymptotic covariance matrix (Eq. 8) can then be expanded to 

e = w'^(i7v - wNw'^)+w 

= (USV^)'^[lAr - (USV'^)N(USV^)'^] + (UEV'^) 
= VS'^U'r[Ijv - USV'^NVE'^U'^]+USV'^ 
= VS'^U'^[UU'r - USV'^NVS'^U'^]+USV^ 
= VS'^U'^[U(lAr - I]V'^NVE'^)UT]+USV'^ 
= VS'^U'^U[Ijv - SV'^NVS'^l+U'^USV'^ 
= VS'^[lAr - SV'^NVS'^] + SV^ 

(D2) 

We partition the matrix of singular values S into aKx K diagonal region E/< (of which only the first L < K diagonal 
entries will be nonzero) and an {N — K) x K zero matrix 0: 

S = \^^] (D3) 
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We can then rewrite the above expression as 





Ik 






{ 


I(W-A') 








= V [ ] 



V'^NV[Sif 0] 



r 


■ Sa " 








(D4) 



We note that pseudoinversion of the quantity in brackets now only requires 0{K^) work, though this can be further 
reduced to 0{L^) work if the reduced SVD is used. 

The singular values S/^ and matrix of right singular vectors V can easily be computed from the eigenvalue de- 
composition of W^W: 



W'^W = (USV'^)'^(USV^) 
= V(S'^S)V'^ 



(D5) 



2. When W has full column rank 



In the case that W has full column rank (because all qk{x), k = \,. . . ,K are unique) we can make further progress. 
Using Eq. D4, we can write 



= [VS^.1(I;^-Sa-VTnVSk + 11^)Sk^VT]-i 
= [V(S^2)VT-N] + 
= [(W^W)-i -N] + 



(D6) 



We note that W^Ia, = Ik, and WNl ^ = l^v, and so W^WNl n = '^k, and observe that [(W^W)-! - N] has rank 
K — 1 with kernel Ik 



[(W^W)-^ - N]1k 



(W'^W)"iw'^WNlif - NIk 

Nl^-Nl^ 





We can supplement the quantity in brackets with 61a- 1 J-, where b is some nonzero scalar, without changing the 
covariance values computed from it, and make it invertible: 



= [(W'^W)-i -N + WaIk]"^ 
We choose b — N^^ to ensure the inversion is well-conditioned (as in [18]), producing 

= [{W^W)-^ -N + 1k1k/N]-\ 



(D7) 



(D8) 
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APPENDIX E: EQUIVALENCE OF MBAR AND BAR FOR TWO STATES 



We start with Eq. 11. For ease of use, we define A/ ~ J2 — fi and Au{x) = U2{x) — ui{x) and M — \iiN2/Ni 
Without loss of generalization, since the equations are symmetric, we examine the self -consistent equation for /i . 



exp[-ui(a;j„)] 



2 N, 



fe=i 



CXp[/i - Ui{xir. 



\ Ni exp[/i - Ui{xi„)] + N2 exp[/2 - U2{xin)] 



exp[/i - Ui{x2n)] 



N2 

V . 

iVi exp[/i - Ui{x2n)] + N2 exp[/2 - U2{x2n)] 
Ni N2 



E 



1 



E 



iVi + iV2 exp[A/ - Au(a;i„)] A^i + TVs exp[A/ - Au(a;2„)] 



[A/ - Au{xi^)] ^^^1 + ^ exp[A/ - A^/(a;2„)] 

1 



1 



. , ■ exp[M + A/ - Au(£c 
We make the additional observation that 



i«)] „=i 1 ■ 



exp[M + A/ - Au(a;2n)] 



(El) 



1 

1 + exp(a;) 



1 



which allows us to write Eq. El as: 



Ni 

0-E 



1 + exp[M + A/ - Au(a;i„)] 



- 1 



1 + cxp(— 



N2 

E 



1 



5r 



1 



Af2 



exp[M + A/ - Au( 



E- 



1 + exp[M + A/ - Au(a;2n) 
1 

exp[-M - A/ + Aw(a;2„)] 



Which is exactly the equation for BAR presented in Shirts et al. [9]. 

We now examine the expression for the variance limited to two states. When the two thermodjmamic states are 
not identical, the W will have full rank, and the asymptotic covariance matrix can be written as (see Eq. D7 above): 



e = [(wTw)-i-n + 1k1k/^]^' 



where we have from Eq. 9 



exp(/i - Ui{Xn)) 



K 



J2 Nk exp(/fc - Ukixn)) 
fc=l 



Defining F as the Fermi function, and X„ = M + Af — Au(cc„), then in the case of two states Wni = ^F{Xn) and 
Wn2 = N^^F{-X^). 
The matrix W^W can then be written as: 



N 



^,\N^'N^'F{Xn)F{-Xn) N^^F{-X^f 



(E2) 
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If we represent the matrix (W^W)y = aij, the determinant |W"''W| will be D = 011022 — a2iO'i2- The variance of 
ratios is actually independent of multiplicative factor used in front of Ik^k, as we will show below, so we will use 
b in place of 1 /N for generality. The inverse of the covariance matrix is then: 



-^ + b ^-N2 + b 



The determinant will then be: 

._ii _ 1 N2a22 - Niaii 



\&- 



D 



D 



+ N1N2 - biNi +N2) + b 



an + 0,22 + ai2 + 021 



(E3) 



However, we note that (W^W)~^ — N is singular, and thus the sum of the first three terms in Eq. E3 equals zero. 
Additionally, because it has kernel \k, it must also satisfy 022 — 021 — N\D — and on — 012 — N2D — 0. Because 
we know by symmetry that 012 = 021, which we denote by simply a, this determinant then becomes: 



I© i| = [011+022 + 012 + 021 - {N^ + N2)D\ = ^ 



We then obtain: 



[(W^W)-^-N + 61jvlj^] 



Ti-l 



D 

Id) 



^~N2 + b 



— - b 
D " 



— - 6 

D " 

^-Ni + b 



The variance in /i — /2 will be Gn + G22 — 20i2, which reduces to: 



Var(/i - /2) 



D 
4ab 
1 

D 



On - NiD + bD + 022 - N2D + bD-2a + 2bD 



D 



[(on - o - N2D) + (022 -a- NiD) + 4bD] 



Which is indeed independent of 6 7^ 0. Since 022 = iVi + o and on = A'^2 + o, given D — 011022 — o^ (as noted above), 
we can find that D ^ N^^'^N^^il ~ Na). We then obtain: 



Var(/i - /2) 



011O22 — o 
o 

l-iVo 
NiN2a 
1 



N 



1 



■ N 

^ 2 + 2cosh(X„) 



N 



NiN2a N1N2 
1 

E i^(^„)F(-x„) 

1 



N 



N1N2 



1 

N 



2 + 2 cosh(Af + A/ - Au{x)) 



N N 
N2^m 



This is the equation for the asymptotic covariance of of free energies given for the BAR method in Shirts et al. [9] 
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